\input memo.tex[let,jmc]
\centerline{\bf SOME EXPERT SYSTEMS NEED COMMON SENSE}
\centerline{\bf John McCarthy, Stanford University}
\centerline{\bf Copyright 1983, John McCarthy}
\giantskip
\whentexed{common.tex[e83,jmc]}
\bigskip\bigskip\bigskip
An {\it expert system} is a computer program intended to embody the
knowledge and ability of an expert in a certain domain. The ideas behind
them and several examples have been described in other lectures in this
symposium. Their performance in their specialized domains is often
very impressive. Nevertheless, hardly any of them have certain
{\it common sense} knowledge and ability possessed by any non-feeble-minded
human. This lack makes them ``brittle''. By this is meant that they are
difficult to extend beyond the scope originally contemplated by their
designers, and they usually don't recognize their own limitations.
Many important applications will require common sense abilities. The object
of this lecture is to describe common sense
abilities and the problems that require them.
Common sense
facts and methods are only very partially understood today, and
extending this understanding is the key problem facing artificial
intelligence.
This isn't exactly a new point of view.
I have been advocating ``Computer Programs with Common
Sense'' since I wrote a paper with that title in 1958. Studying common
sense capability has sometimes been popular and sometimes unpopular
among AI researchers. At present it's popular, perhaps
because new AI knowledge offers new hope of progress.
Certainly AI researchers today
know a lot more about what common sense is than I knew in 1958 ---
or in 1969 when I wrote another paper on the subject.
However, expressing common sense knowledge in formal terms
has proved very difficult, and the number of scientists
working in the area is still far too small.
One of the best known expert systems is
Mycin (Shortliffe 1976; Davis, Buchanan and Shortliffe 1977),
a program for advising physicians on treating bacterial
infections of the blood and meningitis.
It does reasonably well without common sense, provided
the user has common sense and understands the program's limitations.
Mycin conducts a question and answer dialog.
After asking basic facts about the patient such
as name, sex and age, Mycin asks about suspected bacterial
organisms, suspected sites of infection, the presence of specific
symptoms (e.g. fever, headache) relevant to diagnosis, the outcome
of laboratory tests, and some others. It then recommends a certain
course of antibiotics. While the dialog
is in English, Mycin avoids having to understand freely written
English by controlling the dialog. It outputs sentences, but
the user types only single words or standard phrases. Its major
innovations over many previous expert systems were that it
uses measures of uncertainty (not probabilities) for its
diagnoses and that it is prepared to explain its
reasoning to the physician, so he can decide whether to
accept it.
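Mycin's certainty factors deserve a word of explanation. As a rough sketch
following Shortliffe's published description (not the program's exact
details), each rule attaches to its conclusion a certainty factor between
$-1$ and $1$, and two pieces of positive evidence for the same conclusion
are combined by
$$CF = CF_1 + CF_2\,(1 - CF_1),$$
so that accumulating evidence pushes the certainty toward 1 without
requiring the joint probability distributions that the rule authors could
not supply.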
Our discussion of Mycin begins with its ontology.
The ontology of a program
is the set of entities that its variables
range over. Essentially this is what it can have information about.
Mycin's ontology includes bacteria, symptoms, tests, possible
sites of infection, antibiotics and
treatments. Doctors, hospitals, illness and death are absent.
Even patients are not really part of the ontology, although
Mycin asks for many facts about the specific patient. This is because
patients aren't values of variables, and Mycin never compares the
infections of two different patients. It would therefore be difficult
to modify Mycin to learn from its experience.
Mycin's program, written in a general scheme called Emycin,
is a so-called {\it production system}. A production system is a collection
of rules, each of which has two parts --- a pattern part and an
action part. When a rule is activated, Mycin tests whether the
pattern part matches the database. If it does, the variables in the
pattern are bound to whatever entities in the database are required
for the match; if it does not, the pattern fails and Mycin tries
another rule. When the match succeeds, Mycin performs the action
part of the rule using the values of the variables determined by
the pattern part.
The whole process of questioning and recommending is built up
out of productions.
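Schematically (this display is an illustration, not Emycin's actual
notation), a rule has the form
$$\hbox{\bf if}\quad P_1(x,y) \land \cdots \land P_k(x,y) \quad\hbox{\bf then}\quad A(x,y),$$
where the conjunction of the $P_i$ is the pattern part, the variables $x$
and $y$ are bound by matching the pattern against the database, and $A$ is
the action performed with those bindings.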
The production formalism turned out to be suitable
for representing a large amount of information about the
diagnosis and treatment of bacterial infections. When Mycin
is used in its intended manner it scores better than
medical students or interns or practicing physicians and
on a par with experts in bacterial diseases when the latter
are asked to perform in the same way. However, Mycin
has not been put into production use, and the reasons given
by experts in the area varied when I asked whether it would
be appropriate to sell Mycin cassettes to doctors wanting
to put it on their micro-computers.
Some said it would be ok if there were a means of keeping
Mycin's database current with new discoveries in the field,
i.e. with new tests, new theories, new diagnoses and new
antibiotics. For example, Mycin would have to be told
about Legionnaires' disease and the associated {\it Legionella}
bacteria, which became understood only
after Mycin was finished. (Mycin is very stubborn about
new bacteria, and simply replies ``unrecognized response''.)
Others said that Mycin is not even close to usable except
experimentally, because
it doesn't know its own limitations.
I suppose this is partly
a question of whether the doctor using Mycin is trusted
to understand the documentation about its limitations.
Programmers always develop the idea that the
users of their programs are idiots, so the opinion
that doctors aren't smart enough not to be misled by Mycin's limitations
may be at least partly a consequence of this ideology.
An example of Mycin not knowing its limitations can
be elicited by telling Mycin that the patient has {\it Vibrio cholerae}
in his intestines. Mycin will cheerfully recommend two weeks of
tetracycline and nothing else.
Presumably this would indeed kill the bacteria, but
most likely the patient will be dead of cholera long before that.
However, the physician will presumably know that the diarrhea has
to be treated and look elsewhere for how to do it.
On the other hand it may be really true that some measure
of common sense is required for usefulness even in this narrow
domain. We'll list some areas of common sense knowledge
and reasoning ability and consider how they apply to
Mycin and to hypothetical programs operating in Mycin's domain.
\bigskip\bigskip\bigskip
\noindent{\bf WHAT IS COMMON SENSE?}
Understanding common sense capability is now a hot area
of research in artificial intelligence, but there is not yet any
consensus. We will try to divide common sense capability into common
sense knowledge and common sense reasoning, but even this division cannot
be made firm. What one man builds into his program as a reasoning method,
another can express as a fact using a richer ontology. However, the latter
may then have difficulty handling well the generality he has
introduced.
\bigskip\bigskip\bigskip
\noindent{\bf COMMON SENSE KNOWLEDGE}
We shall discuss various areas of common sense knowledge.
1. The most salient common sense knowledge concerns
situations that change in time as a result of events. The most
important events are actions, and for a program to plan intelligently,
it must be able to determine the effects of its own actions.
Consider the Mycin domain as an example. The situation with which
Mycin deals includes the doctor,
the patient and the illness. Since Mycin's actions are
advice to the doctor, full planning would
have to include information about the effects of Mycin's output on what
the doctor will do. Since Mycin doesn't know about the doctor, it might
plan the effects of the course of treatment on the patient. However, it
doesn't do this either. Its rules give the recommended treatment as a function
of the information elicited about the patient, but Mycin makes no
prognosis of the effects of the treatment. Of course, the doctors who
provided the information built into Mycin considered the effects of the
treatments.
Ignoring prognosis
is possible because of the specific narrow domain in which
Mycin operates. Suppose, for example, a certain antibiotic had
the precondition for its usefulness that the patient not have a fever.
Then Mycin might have to make a plan for getting rid of the patient's
fever and verifying that it was gone as a part of the plan for using
the antibiotic. In other domains, expert systems and other
AI programs have to make plans, but Mycin doesn't. Perhaps if I knew
more about bacterial diseases, I would conclude that their treatment
sometimes really does require planning and that lack of planning
ability limits Mycin's utility.
The fact that Mycin doesn't give a prognosis is certainly
a limitation. For example, Mycin cannot be asked on behalf of the
patient or the administration of the hospital when the patient is
likely to be ready to go home. The doctor who uses Mycin must do
that part of the work himself. Moreover, Mycin cannot answer a
question about a hypothetical treatment, e.g. ``What will happen
if I give this patient penicillin?'' or even ``What bad things might
happen if I give this patient penicillin?''.
2. Various formalisms are used in artificial intelligence
for representing facts about the effects of actions and other
events. However, all systems that
I know about give the effects of an event in a situation by
describing a new situation that results from the event.
This is often enough, but it doesn't cover the important case
of concurrent events and actions. For example, if a patient
has cholera, while the antibiotic is killing the cholera bacteria,
the damage to his intestines is causing loss of fluids that are
likely to be fatal. Inventing a formalism that will conveniently
express people's common sense knowledge about concurrent events
is a major unsolved problem of AI.
3. The world is extended in space and is occupied by objects
that change their positions and are sometimes created and destroyed.
The common sense facts about this are difficult to express but
are probably not important in the Mycin example. A major difficulty
is in handling the kind of partial knowledge people ordinarily have.
I can see part of the front of a person in the audience, and my
idea of his shape uses this information to approximate his total shape.
Thus I don't expect him to stick out two feet in back even though
I can't see that he doesn't. However, my idea of the shape of his
back is less definite than that of the parts I can see.
4. The ability to represent and use knowledge about knowledge
is often required for intelligent behavior.
What airline flights there are to Singapore is recorded in the
issue of the International Airline Guide current for
the proposed flight
day. Travel agents know how to book airline flights and can
compute what they cost. An advanced Mycin might need to reason that
Dr. Smith knows about cholera, because
he is a specialist in tropical medicine.
5. A program that must co-operate or compete with people
or other programs must be able to represent information about
their knowledge, beliefs, goals, likes and dislikes, intentions
and abilities.
An advanced Mycin might need to know that a patient won't take a
bad tasting medicine unless he is convinced of its necessity.
6. Common sense includes much knowledge whose domain overlaps
that of the exact sciences but differs from it epistemologically.
For example,
if I spill the glass of water on the podium, everyone knows that
the glass will break and the water will spill. Everyone knows that
this will take a fraction of a second and that the water will not
splash even ten feet. However, this information is not obtained by
using the formula for a falling body or the Navier-Stokes equations
governing fluid flow. We don't have the input data for the equations,
most of us don't know them, and we couldn't integrate them fast
enough to decide whether to jump out of the way. This common
sense physics is contiguous with scientific physics. In fact
scientific physics is embedded in common sense physics, because
it is common sense physics that tells us what the equation
%
$$s = {1\over 2} g t^2$$
%
means.
If Mycin were extended to be a robot physician it would have to know
common sense physics and maybe also some scientific physics.
It is doubtful that the facts of the common sense world can
be represented adequately by production rules. Consider the fact that when two
objects collide they often make a noise. This fact can be used to make
a noise, to avoid making a noise, to explain a noise or to explain the
absence of a noise. It can be used in specific situations involving
a noise, but also to understand general phenomena, e.g. that if an intruder
steps on the gravel, the dog will hear it and bark. A production rule
embodies a fact only as part of a specific procedure.
Typically production rules match facts about specific objects, e.g. a specific
bacterium, against a general rule and get a new fact about those objects.
Much present AI research concerns how to represent facts in
ways that permit them to be used for a wide variety of purposes.
\bigskip\bigskip\bigskip
\noindent{\bf COMMON SENSE REASONING}
Our ability to use common sense knowledge depends on being
able to do common sense reasoning.
Much artificial intelligence inference is not designed
to use directly the rules of inference of any of the well known
systems of mathematical logic. There is often no
clear separation in the program between determining what inferences
are correct and the strategy for finding the inferences required
to solve the problem at hand. Nevertheless, the logical system usually
corresponds to a subset of first order logic. Systems provide for
inferring a fact
about one or two particular objects from other facts about these
objects and a general rule containing variables. Most expert systems,
including Mycin, never
infer general statements, i.e. quantified formulas.
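For example, from a general rule and a particular fact such as
$$\forall x\,(P(x) \supset Q(x)) \qquad\hbox{and}\qquad P(a),$$
such a system will conclude $Q(a)$, where $a$ might stand for a particular
organism; but it will never derive a new quantified sentence such as
$\forall x\,(P(x) \supset R(x))$.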
Human reasoning also involves obtaining facts by observation
of the world, and computer programs also do this. Robert Filman
did an interesting thesis on observation in a chess world where
many facts that could be obtained by deduction are in fact obtained
by observation. Mycin doesn't require this, but our hypothetical
robot physician would have to draw conclusions from a patient's
appearance, and computer vision is not ready for it.
An important new development in AI (since the middle 1970s) is
the formalization of non-monotonic reasoning.
Deductive reasoning in mathematical logic has the following
property --- called monotonicity by analogy with similar mathematical
concepts. Suppose we have a set of assumptions from which follow
certain conclusions. Now suppose we add additional assumptions.
There may be some new conclusions, but every sentence that was
a deductive consequence of the original hypotheses is still a
consequence of the enlarged set.
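In symbols, if $Th(A)$ denotes the set of deductive consequences of a set
$A$ of sentences, monotonicity is the statement
$$A \subseteq B \quad\hbox{implies}\quad Th(A) \subseteq Th(B).$$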
Ordinary human reasoning does not share this monotonicity
property. If you know that I have a car, you may conclude that
it is a good idea to ask me for a ride. If you then learn that
my car is being fixed (which does not contradict what you knew
before), you no longer conclude that you can get a ride. If you
now learn that the car will be out in half an hour you reverse
yourself again.
Several artificial intelligence researchers, for example
Marvin Minsky (1974), have pointed out that intelligent computer
programs will have to reason non-monotonically. Some concluded
that therefore logic is not an appropriate formalism.
However, it has turned out that deduction in mathematical
logic can be supplemented by additional modes of non-monotonic
reasoning, which are just as formal as deduction and just as
susceptible to mathematical study and computer implementation.
Formalized non-monotonic reasoning turns out to give certain
rules of conjecture rather than rules of inference --- their
conclusions are appropriate, but may be disconfirmed when
more facts are obtained. One such method is {\it circumscription},
described in (McCarthy 1980).
A mathematical description of circumscription is beyond the
scope of this lecture, but the general idea is straightforward. We have
a property applicable to objects or a relation applicable to pairs
or triplets, etc. of objects. This property or relation is constrained
by some sentences taken as assumptions, but there is still some freedom left.
Circumscription further constrains the property or relation by
requiring it to be true of a minimal set of objects.
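For those who want a formula: the circumscription of a predicate $P$ in an
axiom $A(P)$ can be written, up to details that vary among the formulations
(McCarthy 1980 states it as a schema), as the second order sentence
$$A(P) \;\land\; \forall \Phi\,[\,A(\Phi) \land \forall x\,(\Phi(x) \supset P(x)) \;\supset\; \forall x\,(P(x) \supset \Phi(x))\,],$$
which says that $P$ satisfies the axiom and that no predicate satisfying
the axiom holds of fewer objects.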
As an example, consider representing the facts about whether an
object can fly in a database of common sense knowledge. We could try
to provide axioms that will determine whether each kind of object can
fly, but this would make the database very large. Circumscription allows
us to express the assumption that only those objects can fly for which
there is a positive statement that they can. Thus there will be positive
statements that birds and airplanes can fly and no statement that
camels can fly. Since we don't include negative statements in the
database, we could provide for flying camels, if there were any, by
adding statements without removing existing statements. This much
is often done by a simpler method --- the {\it closed world assumption}
discussed by Raymond Reiter. However, we also have exceptions to the
general statement that birds can fly. For example, penguins, ostriches
and birds with certain feathers removed can't fly. Moreover, more
exceptions may be found and even exceptions to the exceptions.
Circumscription allows us to state the
known exceptions and to provide for additional exceptions to be added
later --- again without changing existing statements.
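One convenient way of writing such facts --- a sketch of the style used in
later work on circumscription, not a fixed formalism --- introduces an
abnormality predicate:
$$\forall x\,(bird(x) \land \lnot ab(x) \supset flies(x)), \qquad \forall x\,(penguin(x) \supset ab(x)),$$
and circumscription is applied to minimize $ab$. A bird not known to be
abnormal is then conjectured to fly, and a newly discovered kind of
exception is accommodated by adding one more sentence implying $ab$,
without disturbing the statements already present.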
Non-monotonic reasoning also seems to be involved in human
communication. Suppose I hire you to build me a bird cage,
and you build it without a top, and I refuse to pay on the grounds
that my bird might fly away. A judge will side with me. On the other
hand suppose you build it with a top, and I refuse to pay full price
on the grounds that my bird is a penguin, and the top is a waste. Unless
I told you that my bird couldn't fly, the judge will side with you.
We can therefore regard it as a communication convention that if a
bird can fly the fact need not be mentioned, but if the bird can't
fly and it is relevant, then the fact must be mentioned.
\eject
\noindent{\bf References:}
\bigskip
{\bf Davis, Randall; Buchanan, Bruce; and Shortliffe, Edward (1977)}:
``Production Rules as a Representation for a Knowledge-Based Consultation
Program,'' {\it Artificial Intelligence}, Volume 8, Number 1, February.

{\bf McCarthy, John (1960)}: ``Programs with Common Sense,'' {\it Proceedings of the
Teddington Conference on the Mechanization of Thought Processes}, Her Majesty's
Stationery Office, London.

{\bf McCarthy, John and P.J. Hayes (1969)}: ``Some Philosophical Problems from
the Standpoint of Artificial Intelligence,'' in D. Michie (ed), {\it Machine
Intelligence 4}, American Elsevier, New York, NY.

{\bf McCarthy, John (1980)}:
``Circumscription --- A Form of Non-Monotonic Reasoning,'' {\it Artificial
Intelligence}, Volume 13, Numbers 1,2, April.
%.<<aim 334, circum.new[s79,jmc]>>

{\bf Minsky, Marvin (1974)}:
``A Framework for Representing Knowledge,'' {\it M.I.T. AI Memo 252}.

{\bf Shortliffe, Edward H. (1976)}:
{\it Computer-Based Medical Consultations: MYCIN}, American Elsevier, New York, NY.
\eject
%.<<contents of oral presentation
%.
%.*reasoning systems
%.*relation between science and common sense
%.*production rules, Prolog or Microplanner, full reasoning
%.*from neutral facts
%.*non-monotonic reasoning
%.*bird cage
%.
%.*grumble about not enough work on common sense
%.
%.-grumble about Barwise and Perry
%.
%.metaphysical and epistemological adequacy
%.
%
%
%.``Place the control near the bed in a place that is neither hotter nor
%.colder than the room itself. If the control is placed on a radiator or
%.radiant heated floors, it will ``think'' the entire room is hot and will
%.lower your blanket temperature, making your bed too cold. If the control
%.is placed on the window sill in a cold draft, it will ``think'' the entire
%.room is cold and will heat up your bed so it will be too hot.'' --- from the
%.instructions to an electric blanket.
%.
%.
%.ambiguity tolerance
%.>>
\noindent{\bf Answers to questions:}
1. I could have made this a defensive talk about artificial intelligence,
but I chose to emphasize the problems that have been identified rather
than the progress that has been made in solving them. Let me remind you
that I have argued that the need for common sense is not a truism.
Many useful things can be done without it, e.g. Mycin and also
chess programs.
2. Consider your 20 years. If anyone had known in 1963 how to make
a program learn from its experience to do what a human does after 20
years, they might have done it, and it might be pretty smart by now.
Already in 1958 there had been work on programs that learn from
experience. However, all they could learn was to set optimal
values of numerical parameters in the program, and they were
quite limited in their ability to do that. Arthur Samuel's checker
program learned optimal values for its parameters, but the
problem was that certain kinds of desired behavior did not correspond
to any setting of the parameters, because they depended on the recognition
of certain kinds of strategic situations. Thus the first prerequisite
for a program to be able to learn something is that it be able to
represent internally the desired modification of behavior. Simple
changes in behavior must have simple representations. Turing's
universality theorem convinces us that arbitrary behaviors can be
represented, but it doesn't tell us how to represent them in
such a way that a small change in behavior is a small change
in representation. Present methods of changing programs amount
to education by brain surgery.
3. In the use of Mycin, the physician is supposed to supply the
common sense. The question is whether the program must also
have common sense, and I would say that the answer is not clear
in the Mycin case. Purely computational programs don't require
common sense, and none of the present chess programs have any. On
the other hand, it seems clear that many other kinds of programs
require common sense to be useful at all.
\eject
\noindent{\bf PANEL DISCUSSION}
McCarthy: The question is whether AI has illuminated human intelligence,
and I think the answer is obviously yes. AI and psychology
influenced by AI are responsible for destroying behaviorism as
a serious approach to psychology and turning psychologists towards
information processing models. Presumably a psychologist would
be more competent to speak about that influence than an AI person.
Now I want to deal with the issue about whether a machine
really thinks or believes. This is an elaboration of a point I
made in my lecture. Namely, we will find it necessary to use
mentalistic terminology in describing what we know about machines.
Of course, if we understand how a thermostat works, we don't have
to adopt the mentalistic stance of saying that the thermostat
thinks the room is too warm.
Indeed I picked the thermostat example, precisely because
we can understand it both ways --- mechanistically and mentalistically.
Just because we can understand its mechanism is not a reason to
bar the use of mentalistic terms. There's an illuminating analogy
with the number system and its historical development. Suppose
someone said that he didn't think that one is a number --- arguing
that if you have only one thing you don't have to count. Indeed
most languages treat one differently from the other numbers. Some
treat two differently also, and in Russian numbers up to four take
the genitive case. The introduction of zero to the number system
is even more recent, and I believe it was controversial. The justification
is that the number system as a system makes more sense if both
zero and one are included. Likewise, a systematic treatment of
belief by machines will have to start with the easy cases.
A more complex case arises when we say that a dog wants
to go out. We cannot practically reduce this to a propensity
to behave in a certain way, because we may not know what the
dog will do to further this desire. It may scratch the door or
yelp or whatever. Secondly, we may not know the evidence that
the dog wants to go out. Therefore, the fact that the dog
wants to go out is best treated as primary.
Another useful idea comes from Dan Dennett --- the notion
of the ``design stance''. Suppose we are designing a dog
as an artificial intelligence. It will be convenient to design
in the desire to go out as a possible state. We have a variety
of choices as to what will cause this state and what our dog
will do to realize the desire. In designing computer systems,
we will also find this notion of {\it wanting} a useful intermediate
point.
As far as I can see, the purely intellectual terms are
easier to handle for machines than some of the emotional terms.
``It believes'' is easier than ``it hopes'', which is easier than
``it likes me'' or ``it doesn't like me''. And as to whether the machine
is suffering, all I can say is that it complains a lot.
When we ask whether it is conscious, there are a lot of
criteria for saying no. No, because it doesn't know about its
physical body. No, it doesn't even refer to itself. On the
other hand it might claim to be alienated, but it has just
read Marcuse. Well that's how most people who claim to be
alienated come to claim it. It's something they read about.
McCarthy: First of all, the Chinese room. There is a confusion
between the system, of which the person is a part, and the person himself.
I agree with Robert Wilensky who made the same point earlier.
The system knows Chinese, but the person who is executing the
system may not. This is analogous to an interpreter running
in a computer; the interpreted program often has capabilities
the interpreter does not. It's just that we don't have experience
with systems in which a person carries out a mental process
that has properties different from those of the person himself.
We get the same confusion with computers. Someone asks me
whether LISP can do calculus problems. No, LISP cannot do
calculus, but some LISP programs can.
The example of thirst is different. A program that
simulates thirst is not going to be thirsty. For example,
there is no way to relieve it with real water.
Searle has said that the people in AI take the third
person view of mental qualities. That's correct. We do,
and we'll claim that it's a virtue. He says we consider the
problem of intelligence as distinct from biology. Yes, we hold
that intelligence is something that can be dealt with abstractly
just as computation can be discussed and dealt with abstractly.
One can ask whether a computer calculates the sum of 3 and 5
to be 8 in the same sense as a human does. I suppose Searle
would agree that ``calculate'' is being used in the same sense
for both human and machine in this case.
Now there's the point Dreyfus made about it taking
300 years. I have been saying that human level AI will take
between 5 and 500 years. The problem isn't that it will take
a long time to enter data into a computer. It's rather that
conceptual advances are required before we can implement human
level artificial intelligence --- just as conceptual advances
were required beyond the situation in 1900 before we could
have nuclear energy.
Pursuing the nuclear energy analogy, the question is
whether the present AI situation corresponds to 1900 or to
1938 when Rutherford, the leading nuclear physicist, declared
nuclear energy impossible. The situation of 1938 is interesting
in that experiments exhibiting nuclear fission had already been
done but had been misinterpreted. Perhaps someone has already
done experimental research which, when properly interpreted,
will make possible human level AI. I would be very surprised. When
we talk about future conceptual advances, we don't know where
we stand at present.
Bert made a point about a reservation machine not
knowing whether 6:25 will do as a little earlier than 6:30.
The program would have the same problem if it were making
a reservation for a robot. Whether even a 6:29 reservation
will do depends on circumstances. So the fact that the
reservation is for humans isn't the problem.
Finally, let me defend Searle on one point. He was discussing
whether a computer can think in the same sense as a human ---
not does it think in the same way.
In my opinion the thermostat thinks the room is too warm
in the same sense as a human might, and he would disagree.
Likewise about whether the dog simulation wants to go out.
\noindent ***
McCarthy: The question is what will be the situation five to ten
years from now; let me make it ten or fifteen. I think there'll
be a paradigm shift among the public that
will give John Searle the following problem. He will want to
come to the symposium to correct our use of mental terms, but
he won't even get here, because he'll have to correct his secretary
who will tell him, ``It promised to process your travel advance,
but I don't think it will, because it's puzzled about whether
the expenditure for flowers was intended and necessary for
the business's goals''.
Thus in ten or fifteen years, quite mundane systems
used for business and personal purposes will require the use
of a certain amount of mental terminology in order to use them
effectively.
Also let me repeat my warning to philosophers that if
they insist on discussing common sense reasoning only at the
general level of today's discussion, they will lose the
jurisdiction. We need to consider the conditions for the
ascription of particular mental qualities, and this may
require collaboration among philosophers and artificial intelligence
researchers.
We attempted such a collaboration several years ago, but
I think the particular attempt was unsuccessful largely because
it considered overly general questions. This was partly because
the AI people succumbed to the temptation to become amateur
philosophers rather than raising the AI issues to which
philosophy is relevant.